Extending Synchronization Constructs in OpenMP to Exploit Pipeline Parallelism on Heterogeneous Multi-core

Authors

  • Shigang Li
  • Shucai Yao
  • Haohu He
  • Lili Sun
  • Yi Chen
  • Yunfeng Peng
Abstract

The ability to express multiple levels of parallelism is one of the significant features of the OpenMP parallel programming model. However, pipeline parallelism is not well supported in OpenMP. This paper proposes extensions to the OpenMP directives that aim to express pipeline parallelism effectively. The extended directives are divided into two groups: one defines precedence at the thread level, while the other defines precedence at the iteration level. With these directives, programmers can establish a pipeline model more easily and exploit more parallelism to improve performance. To support these directives, a set of runtime interfaces for synchronization is implemented on the Cell heterogeneous multi-core architecture using a signal block communication mechanism. Experimental results indicate that good performance can be obtained from the pipeline scheme proposed in this paper compared to the naive parallel...
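The idea of iteration-level precedence between pipeline stages can be illustrated with a minimal sketch. This is not the paper's directive syntax or its Cell runtime, neither of which appears in the abstract; it is a plain Python analogy in which `threading.Event` objects stand in for hardware signals, and the names `run_pipeline`, `stage1`, and `stage2` are hypothetical.

```python
import threading


def run_pipeline(items):
    """Two-stage pipeline: stage 2 of item i waits only for stage 1 of item i."""
    n = len(items)
    stage1_out = [None] * n
    results = [None] * n
    # One signal per iteration (assumed stand-in for a hardware signal).
    done = [threading.Event() for _ in range(n)]

    def stage1():
        for i, x in enumerate(items):
            stage1_out[i] = x * 2   # first pipeline stage
            done[i].set()           # signal: stage 1 finished iteration i

    def stage2():
        for i in range(n):
            done[i].wait()          # precedence: block until stage 1 has done iteration i
            results[i] = stage1_out[i] + 1  # second pipeline stage

    t1 = threading.Thread(target=stage1)
    t2 = threading.Thread(target=stage2)
    t2.start()
    t1.start()
    t1.join()
    t2.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3, 4]))
```

Because each iteration has its own signal, the second stage can start on item i as soon as the first stage has finished it, rather than waiting at a barrier for the whole loop to complete.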

Similar resources

Unifying Barrier and Point-to-Point Synchronization in OpenMP with Phasers

OpenMP is a widely used standard for parallel programming on a broad range of SMP systems. In the OpenMP programming model, synchronization points are specified by implicit or explicit barrier operations. However, certain classes of computations such as stencil algorithms need to specify synchronization only among particular tasks/threads so as to support pipeline parallelism with better synchro...

Performance Characteristics of OpenMP Language Constructs on a Many-core-on-a-chip Architecture

Recent emerging many-core-on-a-chip architectures present massive on-chip parallelism through hardware support for multithreading. In order to achieve fast development of parallel applications that exploit this massive intrachip parallelism to achieve highly sustainable performance, suitable programming models are needed. OpenMP, the industry de facto standard for writing parallel programs on s...

Expressing DOACROSS Loop Dependences in OpenMP

OpenMP is a widely used programming standard for a broad range of parallel systems. In the OpenMP programming model, synchronization points are specified by implicit or explicit barrier operations within a parallel region. However, certain classes of computations, such as stencil algorithms, can be supported with better synchronization efficiency and data locality when using doacross parallelis...

Synchronization-Free Parallel Collision Detection Pipeline

We present a first parallel and adaptive collision detection pipeline running on a multi-core architecture. This pipeline integrates a first global synchronization-free parallelization of its major steps and makes it possible to dynamically adapt the distribution of parallelism during the simulation. We propose to break the sequentiality of the pipeline by simultaneously executing the two main phases (broad ...

Reducing the Effects of Branch Misprediction through Dynamic Heterogeneous Core Scheduling

The magnitude of a branch misprediction penalty is highly dependent on processor pipeline depth. Although longer pipelines may allow higher clock frequency, because less logic is performed in each stage, a fundamental trade-off exists between increased clock speed and accumulated misprediction penalties. Some programs, with easily predictable branches, will perform optimally on long pipelines. ...

Publication date: 2011